Arta
We introduce a simple but general online learning framework in which a learner plays against an adversary in a vector-valued game that changes every round. Even though the learner's objective is not convex-concave (and so the minimax theorem does not apply), we give a simple algorithm that can compete with the setting in which the adversary must announce their action first, with optimally diminishing regret.
Kernel Learning with Adversarial Features: Numerical Efficiency and Adaptive Regularization
Ribeiro, Antônio H., Vävinggren, David, Zachariah, Dave, Schön, Thomas B., Bach, Francis
Adversarial training has emerged as a key technique to enhance model robustness against adversarial input perturbations. Many of the existing methods rely on computationally expensive min-max problems that limit their application in practice. We propose a novel formulation of adversarial training in reproducing kernel Hilbert spaces, shifting from input to feature-space perturbations. This reformulation enables the exact solution of inner maximization and efficient optimization. It also provides a regularized estimator that naturally adapts to the noise level and the smoothness of the underlying function. We establish conditions under which the feature-perturbed formulation is a relaxation of the original problem and propose an efficient optimization algorithm based on iterative kernel ridge regression. We provide generalization bounds that help to understand the properties of the method. We also extend the formulation to multiple kernel learning. Empirical evaluation shows good performance in both clean and adversarial settings.
- North America > United States (0.14)
- Europe > Sweden > Uppsala County > Uppsala (0.04)
- Asia > Middle East > Jordan (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.92)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.50)
COSMMIC: Comment-Sensitive Multimodal Multilingual Indian Corpus for Summarization and Headline Generation
Kumar, Raghvendra, Salman, S. A. Mohammed, Sahu, Aryan, Nandi, Tridib, P., Pragathi Y., Saha, Sriparna, Moreno, Jose G.
Despite progress in comment-aware multimodal and multilingual summarization for English and Chinese, research in Indian languages remains limited. This study addresses this gap by introducing COSMMIC, a pioneering comment-sensitive multimodal, multilingual dataset featuring nine major Indian languages. COSMMIC comprises 4,959 article-image pairs and 24,484 reader comments, with ground-truth summaries available in all included languages. Our approach enhances summaries by integrating reader insights and feedback. We explore summarization and headline generation across four configurations: (1) using article text alone, (2) incorporating user comments, (3) utilizing images, and (4) combining text, comments, and images. To assess the dataset's effectiveness, we employ state-of-the-art language models such as Llama 3 and GPT-4. We conduct a comprehensive study to evaluate different component combinations, including identifying supportive comments, filtering out noise with a dedicated IndicBERT-based comment classifier, and extracting valuable insights from images with a multilingual CLIP-based classifier. This helps determine the most effective configurations for natural language generation (NLG) tasks. Unlike many existing datasets that are either text-only or lack user comments in multimodal settings, COSMMIC uniquely integrates text, images, and user feedback. This holistic approach bridges gaps in Indian language resources, advancing NLP research and fostering inclusivity.
- Asia > Pakistan (0.04)
- Africa > Middle East > Djibouti > Arta > `Arta (0.04)
- North America > United States > Oregon > Multnomah County > Portland (0.04)
- Media > News (0.67)
- Health & Medicine (0.46)
America's Golden Dome can't wait
In response to an executive order, President Donald Trump's team will present him with a plan for creating the Golden Dome, a missile defense shield meant to guard against attacks that are increasingly difficult to defeat. This effort will demand innovative thinking, collective will and rapid action. Since my tenure as director of the Missile Defense Agency in the early 2000s, an integrated network of sensors based in space, land and sea paired with ground-based interceptors has effectively deterred rudimentary missile attacks on our homeland from Iran, North Korea and others. But as they continue to improve their capabilities and as we look at a resurgent Russia and aggressive China, we need to build our next-generation missile defense. The window to defeat ballistic missiles heading to targets in the US is less than 40 minutes and can be as brief as 10 or 15 minutes if launched from a submarine closer to its target.
- Government > Regional Government > North America Government > United States Government (1.00)
- Government > Military (1.00)
Revisiting Noise in Natural Language Processing for Computational Social Science
Computational Social Science (CSS) is an emerging field driven by the unprecedented availability of human-generated content for researchers. This field, however, presents a unique set of challenges due to the nature of the theories and datasets it explores, including highly subjective tasks and complex, unstructured textual corpora. Among these challenges, one of the less well-studied topics is the pervasive presence of noise. This thesis aims to address this gap in the literature by presenting a series of interconnected case studies that examine different manifestations of noise in CSS. These include character-level errors following the OCR processing of historical records, archaic language, inconsistencies in annotations for subjective and ambiguous tasks, and even noise and biases introduced by large language models during content generation. This thesis challenges the conventional notion that noise in CSS is inherently harmful or useless. Rather, it argues that certain forms of noise can encode meaningful information that is invaluable for advancing CSS research, such as the unique communication styles of individuals or the culture-dependent nature of datasets and tasks. Further, this thesis highlights the importance of nuance in dealing with noise and the considerations CSS researchers must address when encountering it, demonstrating that different types of noise require distinct strategies.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Poland (0.14)
- Europe > Finland (0.14)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Questionnaire & Opinion Survey (1.00)
- Media > News (1.00)
- Leisure & Entertainment (1.00)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Position: Graph Learning Will Lose Relevance Due To Poor Benchmarks
Bechler-Speicher, Maya, Finkelshtein, Ben, Frasca, Fabrizio, Müller, Luis, Tönshoff, Jan, Siraudin, Antoine, Zaverkin, Viktor, Bronstein, Michael M., Niepert, Mathias, Perozzi, Bryan, Galkin, Mikhail, Morris, Christopher
While machine learning on graphs has demonstrated promise in drug design and molecular property prediction, significant benchmarking challenges hinder its further progress and relevance. Current benchmarking practices often lack focus on transformative, real-world applications, favoring narrow domains like two-dimensional molecular graphs over broader, impactful areas such as combinatorial optimization, relational databases, or chip design. Additionally, many benchmark datasets poorly represent the underlying data, leading to inadequate abstractions and misaligned use cases. Fragmented evaluations and an excessive focus on accuracy further exacerbate these issues, incentivizing overfitting rather than fostering generalizable insights. These limitations have prevented the development of truly useful graph foundation models. This position paper calls for a paradigm shift toward more meaningful benchmarks, rigorous evaluation protocols, and stronger collaboration with domain experts to drive impactful and reliable advances in graph learning research, unlocking the potential of graph learning.
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Training and Evaluating with Human Label Variation: An Empirical Study
Kurniawan, Kemal, Mistica, Meladel, Baldwin, Timothy, Lau, Jey Han
Human label variation (HLV) challenges the standard assumption that an example has a single ground truth, instead embracing the natural variation in human labelling to train and evaluate models. While various training methods and metrics for HLV have been proposed, there has been no systematic meta-evaluation of HLV evaluation metrics, contributing to the lack of clarity in the best HLV training method. We propose new evaluation metrics and training methods and empirically meta-evaluate HLV evaluation metrics. We find that training on either disaggregated annotations or soft labels often performs best across metrics, and that our proposed soft metric correlates best with human preference.
- Oceania > Australia (0.14)
- Europe > United Kingdom (0.05)
- Asia > Middle East > Saudi Arabia > Asir Province > Abha (0.04)
Normative Evaluation of Large Language Models with Everyday Moral Dilemmas
Sachdeva, Pratik S., van Nuenen, Tom
The rapid adoption of large language models (LLMs) has spurred extensive research into their encoded moral norms and decision-making processes. Much of this research relies on prompting LLMs with survey-style questions to assess how well models are aligned with certain demographic groups, moral beliefs, or political ideologies. While informative, the adherence of these approaches to relatively superficial constructs tends to oversimplify the complexity and nuance underlying everyday moral dilemmas. We argue that auditing LLMs along more detailed axes of human interaction is of paramount importance to better assess the degree to which they may impact human beliefs and actions. To this end, we evaluate LLMs on complex, everyday moral dilemmas sourced from the "Am I the Asshole" (AITA) community on Reddit, where users seek moral judgments on everyday conflicts from other community members. We prompted seven LLMs to assign blame and provide explanations for over 10,000 AITA moral dilemmas. We then compared the LLMs' judgments and explanations to those of Redditors and to each other, aiming to uncover patterns in their moral reasoning. Our results demonstrate that large language models exhibit distinct patterns of moral judgment, varying substantially from human evaluations on the AITA subreddit. LLMs demonstrate moderate to high self-consistency but low inter-model agreement. Further analysis of model explanations reveals distinct patterns in how models invoke various moral principles. These findings highlight the complexity of implementing consistent moral reasoning in artificial systems and the need for careful evaluation of how different models approach ethical judgment. As LLMs continue to be used in roles requiring ethical decision-making such as therapists and companions, careful evaluation is crucial to mitigate potential biases and limitations.
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- Health & Medicine (0.68)
- Media > News (0.59)
Reddit is all you need: Authorship profiling for Romanian
Ştefănescu, Ecaterina, Jerpelea, Alexandru-Iulius
Authorship profiling is the process of identifying an author's characteristics based on their writings. This centuries-old problem has become more intriguing, especially with recent developments in Natural Language Processing (NLP). In this paper, we introduce a corpus of short texts in the Romanian language, annotated with certain author characteristic keywords; to our knowledge, it is the first of its kind. To build it, we exploit the social media platform Reddit. We leverage its thematic community-based structure (subreddits), which offers information about the author's background. We infer a user's demographics and some broad personal traits, such as age category, employment status, interests, and social orientation, based on the subreddit and other cues. We thus obtain a corpus of 23k+ samples, extracted from 100+ Romanian subreddits. We analyse our dataset, and finally, we fine-tune and evaluate Large Language Models (LLMs) to provide baseline capabilities for authorship profiling using the corpus, indicating the need for further research in the field. We publicly release all our resources.
- Europe > Romania > Vest Development Region > Timiș County > Timișoara (0.05)
- South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
- Europe > Romania > Sud-Vest Oltenia Development Region > Dolj County > Craiova (0.04)
Semantic-Guided RL for Interpretable Feature Engineering
Bouadi, Mohamed, Alavi, Arta, Benbernou, Salima, Ouziri, Mourad
The quality of Machine Learning (ML) models strongly depends on the input data; as such, generating high-quality features is often required to improve predictive accuracy. This process is referred to as Feature Engineering (FE). However, since manual feature engineering is time-consuming and requires case-by-case domain knowledge, Automated Feature Engineering (AutoFE) is crucial. A major challenge that remains is to generate interpretable features. To tackle this problem, we introduce SMART, a hybrid approach that uses semantic technologies to guide the generation of interpretable features through a two-step process: Exploitation and Exploration. The former uses Description Logics (DL) to reason on the semantics embedded in Knowledge Graphs (KG) to infer domain-specific features, while the latter exploits the knowledge graph to conduct a guided exploration of the search space through Deep Reinforcement Learning (DRL). Our experiments on public datasets demonstrate that SMART significantly improves prediction accuracy while ensuring a high level of interpretability.
- Europe > Greece > Attica > Athens (0.05)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- North America > United States > New York (0.04)
- Africa > Middle East > Djibouti > Arta > `Arta (0.04)